Asynchronous Gathering of Opaque Robots with Mobility Faults
Subhajit Pramanick, Saswata Jana, Partha Sarathi Mandal, Gokarna Sharma
We consider the fundamental benchmarking problem of gathering in an $(N,f)$-fault system consisting of $N$ robots, of which at most $f$ may fail in any execution, under asynchrony. Two seminal results established the impossibility of a solution in the oblivious robot (OBLOT) model in a $(2,0)$-fault system under semi-synchrony and in a $(3,1)$-Byzantine fault system under asynchrony. Recently, a breakthrough result circumvented the first impossibility by giving a deterministic algorithm in a $(2,0)$-fault system under asynchrony in the luminous robot (LUMI) model using 2-colored lights. However, another breakthrough result established the impossibility of gathering in a $(2,1)$-crash system in the LUMI model under semi-synchrony. In this paper, we consider a {\em mobility fault} model in which a robot crash impacts only its mobility, not the operation of its light. We establish four results under asynchrony in LUMI with the mobility fault model. We show that it is impossible to solve gathering in a $(2,1)$-mobility fault system using 2-colored lights, and then give a solution using 3-colored lights, which is optimal w.r.t. the number of colors. We then consider an $(N,f)$-mobility fault system, $f
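To make the fault model concrete, here is a minimal sketch of the look-compute-move cycle in the LUMI model under a mobility fault: the crashed robot keeps operating its light but never moves. The movement and light-switching rule below is a hypothetical placeholder, not the paper's 3-color algorithm.

```python
# Illustrative sketch of a look-compute-move cycle in the LUMI model with
# mobility faults: a crashed robot can still change its light but cannot move.
# The rule below is a placeholder, NOT the paper's gathering algorithm.
from dataclasses import dataclass

COLORS = ("A", "B", "C")  # 3-colored lights, as in the paper's upper bound

@dataclass
class Robot:
    pos: float
    light: str = "A"
    mobility_faulty: bool = False  # mobility fault: the light still works

def step(r: Robot, other: Robot) -> None:
    """One look-compute-move cycle under a toy synchronous scheduler."""
    # Look: snapshot of the other robot's position and light.
    target, seen = other.pos, other.light
    # Compute: placeholder rule -- advance the light color and head to the
    # midpoint (a real algorithm would condition the move on `seen`).
    r.light = COLORS[(COLORS.index(r.light) + 1) % 3]
    move_to = (r.pos + target) / 2
    # Move: suppressed entirely when the robot's mobility has crashed.
    if not r.mobility_faulty:
        r.pos = move_to

a, b = Robot(0.0), Robot(1.0, mobility_faulty=True)
for _ in range(20):
    step(a, b)
    step(b, a)
print(a.pos, b.pos)  # the non-faulty robot converges toward the crashed one
```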
A Dynamic Stackelberg Game Framework for Agentic AI Defense Against LLM Jailbreaking
As large language models (LLMs) are increasingly deployed in critical applications, the challenge of jailbreaking, where adversaries manipulate the models to bypass safety mechanisms, has become a significant concern. This paper presents a dynamic Stackelberg game framework to model the interactions between attackers and defenders in the context of LLM jailbreaking. The framework treats the prompt-response dynamics as a sequential extensive-form game, where the defender, as the leader, commits to a strategy while anticipating the attacker's optimal responses. We propose a novel agentic AI solution, the "Purple Agent," which integrates adversarial exploration and defensive strategies using Rapidly-exploring Random Trees (RRT). The Purple Agent actively simulates potential attack trajectories and intervenes proactively to prevent harmful outputs. This approach offers a principled method for analyzing adversarial dynamics and provides a foundation for mitigating the risk of jailbreaking.
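Since the approach leans on RRT, the following minimal sketch shows vanilla RRT exploration in a toy 2D space standing in for prompt space. Treating prompts as points in the plane, the disk-shaped "harmful region", and all constants are illustrative assumptions, not the Purple Agent's actual representation.

```python
# Minimal RRT sketch in a 2D toy space, standing in for the Purple Agent's
# exploration of attack trajectories. The coordinates, the "harmful region"
# disk, and the step size are illustrative assumptions only.
import math
import random

GOAL = (8.0, 8.0)     # hypothetical harmful-region center
GOAL_RADIUS = 0.5
STEP = 0.5

def steer(p, q, step=STEP):
    """Move from p toward q by at most `step`."""
    d = math.dist(p, q)
    if d <= step:
        return q
    t = step / d
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def rrt(start=(0.0, 0.0), iters=5000, bounds=(0.0, 10.0)):
    nodes, parent = [start], {start: None}
    for _ in range(iters):
        sample = (random.uniform(*bounds), random.uniform(*bounds))
        nearest = min(nodes, key=lambda n: math.dist(n, sample))
        new = steer(nearest, sample)
        nodes.append(new)
        parent[new] = nearest
        if math.dist(new, GOAL) <= GOAL_RADIUS:
            return nodes, parent, new  # found a path into the (toy) region
    return nodes, parent, None

nodes, parent, hit = rrt()
print("reached harmful region:", hit is not None, "| tree size:", len(nodes))
```

In the paper's framing, a defender would intervene along such discovered trajectories before a deployed model reaches the harmful region; the sketch only illustrates the exploration primitive.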
Is Bellman Equation Enough for Learning Control?
Haoxiang You, Lekan Molu, Ian Abraham
The Bellman equation and its continuous-time counterpart, the Hamilton-Jacobi-Bellman (HJB) equation, serve as necessary conditions for optimality in reinforcement learning and optimal control. While the value function is known to be the unique solution to the Bellman equation in tabular settings, we demonstrate that this uniqueness fails to hold in continuous state spaces. Specifically, for linear dynamical systems, we prove the Bellman equation admits at least $\binom{2n}{n}$ solutions, where $n$ is the state dimension. Crucially, only one of these solutions yields both an optimal policy and a stable closed-loop system. We then demonstrate a common failure mode in value-based methods: convergence to unstable solutions due to the exponential imbalance between admissible and inadmissible solutions. Finally, we introduce a positive-definite neural architecture that guarantees convergence to the stable solution by construction to address this issue.
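The claimed count can be checked by hand in the scalar case: for $n = 1$, the continuous-time HJB equation for a linear-quadratic problem with dynamics $\dot{x} = ax + bu$ and cost $\int (qx^2 + ru^2)\,dt$ reduces to the quadratic Riccati equation $q + 2ap - b^2p^2/r = 0$, which has $\binom{2}{1} = 2$ roots, only one of them stabilizing. A minimal numerical check, with constants chosen arbitrarily:

```python
# Worked n = 1 instance of the paper's count: the scalar continuous-time
# Riccati equation q + 2*a*p - b^2*p^2/r = 0 has binom(2,1) = 2 solutions,
# and only one yields a stable closed loop. Constants a, b, q, r are arbitrary.
import math

a, b, q, r = 1.0, 1.0, 1.0, 1.0   # dynamics x' = a*x + b*u, cost q*x^2 + r*u^2
disc = math.sqrt(a * a + b * b * q / r)
for p in (r * (a + disc) / b**2, r * (a - disc) / b**2):
    k = b * p / r                  # candidate optimal gain, u = -k*x
    closed_loop = a - b * k        # pole of the closed loop x' = (a - b*k)*x
    hjb_residual = q + 2 * a * p - (b * p) ** 2 / r
    print(f"p = {p:+.4f}  HJB residual = {hjb_residual:+.1e}  "
          f"closed-loop pole = {closed_loop:+.4f}  "
          f"{'stable' if closed_loop < 0 else 'UNSTABLE'}")
```

Both candidates satisfy the HJB equation exactly, but only the positive root gives a negative closed-loop pole, matching the paper's point that value-based methods can converge to a valid yet unstable solution.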
Leveraging Noisy Observations in Zero-Sum Games
Emmanouil M. Athanasakos, Samir M. Perlaza
This paper studies an instance of zero-sum games in which one player (the leader) commits to its opponent (the follower) to choose its actions by sampling a given probability measure (strategy). The actions of the leader are observed by the follower as the output of an arbitrary channel. In response, the follower chooses its action based on its current information, that is, the leader's commitment and the corresponding noisy observation of the leader's action. Within this context, the equilibrium of the game with noisy action observability is shown to always exist, and necessary conditions for its uniqueness are identified. Interestingly, the noisy observations have an important impact on the cardinality of the follower's set of best responses. Under particular conditions, this set of best responses is proved to be a singleton almost surely. The proposed model captures any channel noise with a density with respect to the Lebesgue measure. As an example, the case in which the channel is described by a Gaussian probability measure is investigated.
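A toy numerical sketch of the follower's side, assuming a binary action set, a $2 \times 2$ zero-sum payoff matrix, and a Gaussian channel; all values are illustrative choices, not from the paper:

```python
# Toy sketch of the follower's best response under a Gaussian channel.
# The payoff matrix, strategy, and noise level are illustrative assumptions.
import math

U = [[1.0, -1.0], [-1.0, 1.0]]  # zero-sum payoffs U[leader][follower] to the leader
strategy = [0.6, 0.4]           # leader's committed mixed strategy over actions {0, 1}
sigma = 0.8                     # channel: follower sees y = leader_action + N(0, sigma^2)

def gauss(y, mean):
    return math.exp(-(y - mean) ** 2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def best_response(y):
    """Posterior over the leader's action given y, then a payoff-minimizing reply."""
    w = [strategy[x] * gauss(y, float(x)) for x in (0, 1)]
    post = [wi / sum(w) for wi in w]
    # The follower minimizes the leader's expected payoff. A tie would give a
    # non-singleton best-response set; with this channel, ties occur only on a
    # measure-zero set of observations y.
    exp_payoff = [sum(post[x] * U[x][bf] for x in (0, 1)) for bf in (0, 1)]
    return min((0, 1), key=lambda bf: exp_payoff[bf]), post

for y in (-0.5, 0.5, 1.5):
    br_action, post = best_response(y)
    print(f"y = {y:+.1f}  posterior = ({post[0]:.2f}, {post[1]:.2f})  best response = {br_action}")
```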
Smoothed Online Learning is as Easy as Statistical Learning
Adam Block, Yuval Dagan, Noah Golowich, Alexander Rakhlin
Much of modern learning theory has been split between two regimes: the classical \emph{offline} setting, where data arrive independently, and the \emph{online} setting, where data arrive adversarially. While the former model is often both computationally and statistically tractable, the latter requires no distributional assumptions. In an attempt to achieve the best of both worlds, previous work proposed the smooth online setting where each sample is drawn from an adversarially chosen distribution, which is smooth, i.e., it has a bounded density with respect to a fixed dominating measure. We provide tight bounds on the minimax regret of learning a nonparametric function class, with nearly optimal dependence on both the horizon and smoothness parameters. Furthermore, we provide the first oracle-efficient, no-regret algorithms in this setting. In particular, we propose an oracle-efficient improper algorithm whose regret achieves optimal dependence on the horizon and a proper algorithm requiring only a single oracle call per round whose regret has the optimal horizon dependence in the classification setting and is sublinear in general. Both algorithms have exponentially worse dependence on the smoothness parameter of the adversary than the minimax rate. We then prove a lower bound on the oracle complexity of any proper learning algorithm, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smooth online learning. Finally, we apply our results to the contextual bandit setting to show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits in the case that contexts arrive in a smooth manner.
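A minimal sketch of the setting, assuming a threshold class on $[0,1]$ and a sliding-window $\sigma$-smooth adversary: the learner makes one ERM-oracle call per round. This is plain follow-the-leader, illustrating the oracle model rather than the paper's oracle-efficient algorithms.

```python
# Illustrative sketch of the smooth online setting: each x_t is drawn from an
# adversarially chosen distribution whose density w.r.t. Uniform[0,1] is at
# most 1/SIGMA. The learner answers via a single ERM-oracle call per round.
# Threshold class and window adversary are assumptions for the sketch.
import random

SIGMA = 0.25          # smoothness parameter: density bounded by 1/SIGMA
TRUE_THRESHOLD = 0.5  # hidden target f*(x) = 1[x >= 0.5] (realizable case)

def smooth_adversary(t):
    """Uniform on an adversarially slid window of mass SIGMA: a SIGMA-smooth draw."""
    lo = (0.37 * t) % (1 - SIGMA)
    return random.uniform(lo, lo + SIGMA)  # density 1/SIGMA on its window

def erm_oracle(history):
    """ERM over threshold classifiers h_c(x) = 1[x >= c] on the data so far."""
    cuts = [0.0] + sorted(x for x, _ in history) + [1.0]
    def errors(c):
        return sum(((x >= c) != y) for x, y in history)
    return min(cuts, key=errors)

history, mistakes = [], 0
for t in range(1, 301):
    x = smooth_adversary(t)
    c = erm_oracle(history)          # one oracle call per round
    pred, y = x >= c, x >= TRUE_THRESHOLD
    mistakes += int(pred != y)       # the comparator f* makes zero mistakes
    history.append((x, y))
print("cumulative mistakes over 300 rounds:", mistakes)
```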
Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time
Weichen Wang, Jiequn Han, Zhuoran Yang, Zhaoran Wang
Reinforcement learning is a powerful tool for learning the optimal policy of possibly many agents interacting with an environment. As the number of agents grows very large, the system can be approximated by a mean-field problem, which has motivated new research directions in mean-field control (MFC) and mean-field games (MFG). In this paper, we study the policy gradient method for linear-quadratic mean-field control and games, where we assume each agent has identical linear state transitions and quadratic cost functions. While most recent work on policy gradient for MFC and MFG is based on discrete-time models, we focus on continuous-time models, where the analysis techniques may be of independent interest. For both MFC and MFG, we provide a policy gradient update and show that it converges to the optimal solution at a linear rate, which we verify with a synthetic simulation. For MFG, we also provide sufficient conditions for the existence and uniqueness of the Nash equilibrium.
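As a single-agent baseline (dropping the mean-field coupling entirely), linear convergence of policy gradient can be seen on a scalar continuous-time LQ problem, where the cost $J(k)$ of the gain $k$ in $u = -kx$ is available in closed form. All constants are arbitrary; with $a = b = q = r = 1$ the optimum is $k^* = 1 + \sqrt{2}$.

```python
# Minimal policy-gradient sketch on a scalar continuous-time LQ problem
# (single agent, no mean-field coupling): exact gradient descent on the
# feedback gain k in u = -k*x, with J(k) in closed form for x0 = 1.
import math

a, b, q, r = 1.0, 1.0, 1.0, 1.0
k_star = (a + math.sqrt(a * a + b * b * q / r)) / b  # gain from the stabilizing Riccati root

def cost(k):
    """J(k) = (q + r*k^2) / (2*(b*k - a)); finite only if b*k > a (stabilizing)."""
    return (q + r * k * k) / (2 * (b * k - a))

def grad(k):
    return (2 * r * k * (b * k - a) - b * (q + r * k * k)) / (2 * (b * k - a) ** 2)

k, lr = 3.5, 0.5  # initial stabilizing gain and step size
for it in range(30):
    k -= lr * grad(k)
    if it % 5 == 4:
        print(f"iter {it + 1:2d}  k = {k:.6f}  gap = {abs(k - k_star):.2e}  J = {cost(k):.6f}")
```

The printed gap shrinks by a roughly constant factor per iteration, the linear rate in miniature; the mean-field versions in the paper couple this recursion with the evolution of the population distribution.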